The Universal Approximation Property
The universal approximation property of various machine learning models is
currently only understood on a case-by-case basis, limiting the rapid
development of new theoretically justified neural network architectures and
blurring our understanding of our current models' potential. This paper works
towards overcoming these challenges by presenting a characterization, a
representation, a construction method, and an existence result, each of which
applies to any universal approximator on most function spaces of practical
interest. Our characterization result is used to describe which activation
functions allow the feed-forward architecture to maintain its universal
approximation capabilities when multiple constraints are imposed on its final
layers and its remaining layers are only sparsely connected. Such admissible
activation functions include a rescaled and shifted Leaky ReLU, but not the ReLU activation
function. Our construction and representation result is used to exhibit a
simple modification of the feed-forward architecture, which can approximate any
continuous function with non-pathological growth, uniformly on the entire
Euclidean input space. This improves the known capabilities of the feed-forward
architecture.
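As a concrete illustration of the admissible activations, here is a minimal PyTorch sketch of a rescaled and shifted Leaky ReLU. The constants `scale`, `shift`, and `negative_slope` below are illustrative placeholders, not the specific values characterized in the paper.

```python
import torch


class RescaledShiftedLeakyReLU(torch.nn.Module):
    """Leaky ReLU composed with an affine rescaling and shift.

    The paper characterizes which such activations preserve universality
    under the stated constraints; the exact admissible constants are not
    reproduced here, so the defaults below are placeholders.
    """

    def __init__(self, scale: float = 1.1, shift: float = 0.5,
                 negative_slope: float = 0.01) -> None:
        super().__init__()
        self.scale = scale
        self.shift = shift
        self.negative_slope = negative_slope

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # sigma(x) = scale * LeakyReLU(x) + shift
        return self.scale * torch.nn.functional.leaky_relu(
            x, negative_slope=self.negative_slope) + self.shift
```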
Deep Learning in a Generalized HJM-type Framework Through Arbitrage-Free Regularization
We introduce a regularization approach to arbitrage-free factor-model
selection. The considered model selection problem seeks to learn the closest
arbitrage-free HJM-type model to any prespecified factor-model. An asymptotic
solution to this, a priori computationally intractable, problem is represented
as the limit of a 1-parameter family of optimizers to computationally tractable
model-selection tasks. Each of these simplified tasks seeks to learn the
model most similar to the prescribed factor-model, subject to a penalty
detecting when the reference measure is a local martingale measure for
the entire underlying financial market. A simple expression for the penalty
terms is obtained in the bond market within the affine term-structure setting,
and it is used to formulate a deep-learning approach to arbitrage-free affine
term-structure modelling. Numerical experiments are also performed to
evaluate the approach's performance in the bond market.
Universal Regular Conditional Distributions
We introduce a general framework for approximating regular conditional
distributions (RCDs). Our approximations of these RCDs are implemented by a new
class of geometric deep learning models with inputs in $\mathbb{R}^d$ and
outputs in the Wasserstein-1 space $\mathcal{P}_1(\mathbb{R}^D)$. We find
that the models built using our framework can approximate any continuous
function from $\mathbb{R}^d$ to $\mathcal{P}_1(\mathbb{R}^D)$ uniformly on
compacts, and quantitative rates are obtained. We identify two methods for
avoiding the "curse of dimensionality", i.e., regimes in which the number of
parameters determining the approximating neural network depends only
polynomially on the involved dimension and on the approximation error. The
first solution describes functions in $C(\mathbb{R}^d,\mathcal{P}_1(\mathbb{R}^D))$
which can be efficiently approximated on any compact subset of $\mathbb{R}^d$.
Conversely, the second approach describes sets in $\mathbb{R}^d$ on which any
function in $C(\mathbb{R}^d,\mathcal{P}_1(\mathbb{R}^D))$ can be efficiently
approximated.
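For reference, the metric on the output space $\mathcal{P}_1(\mathbb{R}^D)$ in which these approximation guarantees are stated is the standard Wasserstein-1 (Kantorovich-Rubinstein) distance:

```latex
\[
  \mathcal{W}_{1}(\mu,\nu)
    \;=\; \inf_{\pi \in \Pi(\mu,\nu)}
      \int_{\mathbb{R}^{D} \times \mathbb{R}^{D}}
        \lVert y - y' \rVert \, \mathrm{d}\pi(y, y'),
\]
% where \Pi(\mu,\nu) denotes the set of couplings of \mu and \nu.
```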
Our framework is used to obtain an affirmative answer to the open conjecture of
Bishop (1994); namely: mixture density networks are universal regular
conditional distributions. The predictive performance of the proposed models is
evaluated against comparable learning models on various probabilistic
prediction tasks in the context of extreme learning machines (ELMs), model
uncertainty, and heteroscedastic regression. All the results are obtained for
more general input and output spaces and thus apply to geometric deep
learning contexts.
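To make the Bishop (1994) conjecture concrete, below is a minimal PyTorch sketch of a mixture density network: it maps an input $x$ to the parameters of a Gaussian mixture, i.e., an approximate regular conditional distribution of $Y$ given $X = x$. Layer widths and the number of mixture components are illustrative choices, not the constructions analyzed in the paper.

```python
import torch
import torch.nn as nn


class MixtureDensityNetwork(nn.Module):
    """Minimal mixture density network in the sense of Bishop (1994).

    Maps an input x to the weights, means, and scales of a Gaussian
    mixture approximating the conditional distribution P(Y | X = x).
    """

    def __init__(self, in_dim: int, out_dim: int,
                 n_components: int = 5, hidden: int = 64) -> None:
        super().__init__()
        self.n_components = n_components
        self.out_dim = out_dim
        self.trunk = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.logits = nn.Linear(hidden, n_components)           # mixture weights
        self.means = nn.Linear(hidden, n_components * out_dim)  # component means
        self.log_scales = nn.Linear(hidden, n_components * out_dim)

    def forward(self, x: torch.Tensor):
        h = self.trunk(x)
        weights = torch.softmax(self.logits(h), dim=-1)
        means = self.means(h).view(-1, self.n_components, self.out_dim)
        scales = self.log_scales(h).exp().view(
            -1, self.n_components, self.out_dim)
        return weights, means, scales


# Usage: one forward pass yields, for each input, a Gaussian mixture.
model = MixtureDensityNetwork(in_dim=3, out_dim=2)
w, mu, sigma = model(torch.randn(8, 3))  # shapes: (8,5), (8,5,2), (8,5,2)
```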